GPU Kernels as Data-Parallel Array Computations in Haskell
Authors
Abstract
We present a novel high-level parallel programming model aimed at graphics processing units (GPUs). We embed GPU kernels as data-parallel array computations in the purely functional language Haskell. GPU and CPU computations can be freely interleaved with the type system tracking the two different modes of computation. The embedded language of array computations is sufficiently limited that our system can automatically extract these computations and compile them to efficient GPU code. In this paper, we outline our approach and present the results of a few preliminary benchmarks.
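To make the idea of an embedded array language concrete, here is a minimal, self-contained sketch of a deeply embedded data-parallel DSL in Haskell. This is an illustration of the general technique, not the paper's actual API: the names (`Arr`, `Scalar`, `runArr`, `runScalar`, `dotp`) are invented for this example, and the interpreter stands in for a GPU backend that would instead compile the AST to device code.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Array computations are first-class AST values; each constructor
-- mirrors a collective operation a GPU backend could compile.
data Arr a where
  Use  :: [a] -> Arr a                           -- embed CPU-side data
  Map  :: (a -> b) -> Arr a -> Arr b             -- data-parallel map
  ZipW :: (a -> b -> c) -> Arr a -> Arr b -> Arr c

-- Reductions collapse an array to a scalar result.
data Scalar a where
  Fold :: (a -> a -> a) -> a -> Arr a -> Scalar a

-- A reference interpreter over Haskell lists, standing in for the
-- code generator that would target the GPU.
runArr :: Arr a -> [a]
runArr (Use xs)     = xs
runArr (Map f a)    = map f (runArr a)
runArr (ZipW f a b) = zipWith f (runArr a) (runArr b)

runScalar :: Scalar a -> a
runScalar (Fold f z a) = foldr f z (runArr a)

-- Dot product expressed entirely inside the embedded language:
-- the whole computation is a data structure a compiler can inspect.
dotp :: Num a => [a] -> [a] -> Scalar a
dotp xs ys = Fold (+) 0 (ZipW (*) (Use xs) (Use ys))

main :: IO ()
main = print (runScalar (dotp [1, 2, 3] [4, 5, 6 :: Int]))  -- prints 32
```

Because the embedded program is an ordinary Haskell data structure, the host program can build it with CPU-side Haskell code and hand it off for separate compilation, which is what lets the type system keep the two modes of computation apart.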
Similar papers
A language for hierarchical data parallel design-space exploration on GPUs
Graphics Processing Units (GPUs) offer potential for very high performance; they are also rapidly evolving. Obsidian is an embedded language (in Haskell) for implementing high performance kernels to be run on GPUs. We would like to have our cake and eat it too; we want to raise the level of abstraction beyond CUDA code and still give the programmer control over the details relevant to kernel pe...
A Language for Nested Data Parallel Design-space Exploration on GPUs
Graphics Processing Units (GPUs) offer potential for very high performance; they are also rapidly evolving. Obsidian is an embedded language (in Haskell) for implementing high performance kernels to be run on GPUs. We would like to have our cake and eat it too; we want to raise the level of abstraction beyond CUDA code and still give the programmer control over the details relevant to kernel perfor...
GPGPU Kernel Implementation using an Embedded Language: a Status Report
Obsidian is a domain-specific language for general-purpose computations on graphics processing units (GPUs), embedded in Haskell. This report presents examples of GPU kernels written in Obsidian, as well as parts of the current implementation of Obsidian. The goal of Obsidian is to raise the level of abstraction for the programmer without sacrificing performance. The kind of decisions and tradeoff...
Programming Future Parallel Architectures with Haskell and Intel ArBB
New parallel architectures, such as Cell, Intel MIC, GPUs, and tiled architectures, enable high performance but are often hard to program. What is needed is a bridge between high-level programming models, where programmers are most productive, and modern parallel architectures. We propose that this bridge is Embedded Domain-Specific Languages (EDSLs). One attractive target for EDSLs is Intel ArBB...
GPU-UniCache: Automatic Code Generation of Spatial Blocking for Stencils on GPUs
Spatial blocking is a critical memory-access optimization to efficiently exploit the computing resources of parallel processors, such as many-core GPUs. By reusing cache-loaded data over multiple spatial iterations, spatial blocking can significantly lessen the pressure of accessing slow global memory. Stencil computations, for example, can exploit such data reuse via spatial blocking through t...
Journal title:
Volume, issue:
Pages: -
Publication date: 2009